(1) Real Party in Interest 

A statement identifying by name the real party in interest is contained in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial 
proceedings which will directly affect or be directly affected by or have a bearing on the 
Board's decision in the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is incorrect. A correct 
statement of the status of the claims is as follows: 
This appeal involves claims 1-12. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection 
contained in the brief is correct. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is 
substantially correct. The changes are as follows: The examiner has treated and will 
continue to treat Prager as the primary reference and will continue to reject the claims 
based on the teachings of Prager in view of Pugh. Thus, claims 1-12 are rejected under 
35 U.S.C. 103(a) as being unpatentable over Prager (US Patent Number 5,943,670, issued 
on August 24, 1999) in view of Pugh et al. (hereinafter Pugh, US Patent Number 
6,658,423, filed on January 24, 2001). 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct. 

(8) Evidence Relied Upon 

6,658,423 Pugh et al. 12-2003 

5,943,670 Prager 8-1999 

(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 
Claims 1-12 remain rejected under 35 U.S.C. 103(a) as being unpatentable over 
Prager (US Patent Number 5,943,670, issued on August 24, 1999) in view of Pugh et al. 
(hereinafter Pugh, US Patent Number 6,658,423, filed on January 24, 2001). 

Regarding independent claim 1, Prager discloses a method in which in a set of 
documents the nearest neighbors of a document are selected based on nearest 
neighbor similarity scores (column 1, line 55-column 2, line 42 of Prager). Prager does 
not disclose that the documents viewed to be identical are flagged as potential 
duplicates. However, Pugh discloses a method in which based on detection scores a 
document is selected as being potentially duplicate (column 7, line 26-column 8, line 28 
of Pugh). It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to have used the method of Prager with the method of Pugh because 
it would have allowed for duplicates to be eliminated from categories, providing more 
accurate search results. 

Regarding dependent claim 2, Prager does not disclose that the documents 
viewed to be similar based on a score are flagged as potential duplicates. However, 
Pugh discloses a method in which based on detection scores (higher than a certain 
tolerance) a document is selected as being potentially duplicate (column 7, line 26- 
column 8, line 28 of Pugh). It would have been obvious to one of ordinary skill in the art 
at the time the invention was made to have used the method of Prager with the method of 
Pugh because it would have allowed for duplicates to be eliminated from categories, 
providing more accurate search results. 

Regarding dependent claims 3 and 4, Prager discloses a method in which the 
nearest neighbor calculations, which in this case are k nearest neighbor calculations, 
are not performed for duplicate detection; rather, they are used to categorize the 
documents (column 1, line 55-column 2, line 42 of Prager). 

Regarding dependent claims 5 and 6, Prager discloses a method in which the 
documents can be text documents with visual formatting (column 4, line 34-column 5, 
line 3 of Prager). 

Regarding dependent claims 7 and 8, Prager discloses a method in which the 
documents may consist of audio presentations (column 4, line 34-column 5, line 3 of 
Prager). It would have been obvious to one of ordinary skill in the art at the time the 
invention was made that voice recordings and musical performances were audio 
presentations, as this was well known. 

Regarding dependent claim 9, Prager discloses a method in which the 
documents can be images (column 4, line 34-column 5, line 3 of Prager). 

Regarding dependent claim 10, Prager discloses a method in which only k 
nearest neighbor calculations are used for similarity scores (column 1, line 55-column 2, 
line 42 of Prager). 

Regarding independent claim 11 and dependent claim 12, the claims 
incorporate substantially similar subject matter as claims 1 and 2. Thus, the claims are 
rejected along the same rationale as claims 1 and 2. 

(10) Response to Argument 

Appellant's arguments filed 8/18/2005 have been fully considered but they are 
not persuasive. 

Regarding the appellant's arguments on pages 2-4 concerning the "triangulation 
element" in claims 1 and 11 and whether or not it is disclosed by Prager in view of 
Pugh, the examiner believes that the rejection is proper based on the fact that the 
combination of the references renders the claimed invention obvious. The appellant 
argues that neither Pugh nor Prager includes the triangulation element, even specifically 
stating that the triangulation is used to detect duplicates in a k nearest neighbor set 
(page 2, lines 22-24, of Appellant's Brief); however, it is unclear to the examiner where 
the "triangulation element" exists in claims 1 and 11. In addition to this, the examiner 
feels it necessary to point out that the "nearest neighbors" that are selected are not 
limited to "k nearest neighbors" until dependent claims 4 and/or 10 for independent 
claim 1, and are never limited to "k nearest neighbors" for independent claim 11. The 
claim states that for a document, nearest neighbors of that document are selected from 
the set of documents; these nearest neighbor candidates are then flagged as potentially 
duplicate documents if the nearest neighbor similarity scores are the same (see claim 1 
and claim 11). As it is claimed, the invention analyzes the nearest neighbors of a 
document based on their similarity score and deems them as potential duplicates if they 
are identically scored. These independent claims have no additional limitations; there 
exists no mention of triangulation elements or k nearest neighbor scores in these 
claims. Prager discloses a method in which in a set of documents the nearest 
neighbors of a document are selected based on nearest neighbor similarity scores 
(column 1, line 55-column 2, line 42 of Prager), thus teaching the first element of claims 
1 and 11, in which a nearest neighbor similarity score is generated and nearest 
neighbor documents are selected based on the score. Pugh teaches that documents 
may be selected as being potentially identical based on similarity detection scores 
(column 7, line 26-column 8, line 28 of Pugh). One of ordinary skill in the art at the time 
the invention was made would appreciate the obviousness of applying the method of 
Pugh using the nearest neighbor similarity scores of Prager. In fact, one would 
appreciate the ability to use any similarity score in the teachings of Pugh because the 
teachings relied upon in this rejection are simply the ability to compare numbers and 
make a determination of whether or not documents are potentially identical based on 
those numbers; thus any similarity scores generated could be easily applied in the 
teachings of Pugh. Thus, as the rejection stands, Prager teaches that nearest neighbor 
similarity scores are generated based on nearest neighbors of a particular document (the 
first limitation of the claim), and Pugh teaches that similarity scores can be used to 
make a determination of whether or not documents are potentially duplicate (the second 
limitation of the claim). 
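
For illustration only, the two limitations as characterized above can be sketched in a 
few lines of Python. This is a minimal sketch under stated assumptions and is not the 
method of Prager, Pugh, or the appellant: the word-overlap similarity measure, the 
function names, and the sample documents are all hypothetical, chosen only to show 
nearest neighbors being scored against a document and then flagged as potential 
duplicates when their similarity scores are the same. 

```python
from collections import defaultdict


def jaccard(a, b):
    """Hypothetical similarity measure (an assumption): word-set overlap in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def nearest_neighbors(query, documents, k=None):
    """First limitation: score each document against the query document and
    return (doc_id, score) pairs, best first. k=None keeps every neighbor,
    since claims 1 and 11 are not limited to k nearest neighbors."""
    scored = [(doc_id, jaccard(query, text)) for doc_id, text in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if k is None else scored[:k]


def flag_identical_scores(neighbors):
    """Second limitation: flag neighbors whose similarity scores are the same."""
    by_score = defaultdict(list)
    for doc_id, score in neighbors:
        by_score[score].append(doc_id)
    # Only groups of two or more identically scored neighbors are flagged.
    return [sorted(ids) for ids in by_score.values() if len(ids) > 1]


# Hypothetical sample data: "a" and "b" have identical text.
documents = {
    "a": "the quick brown fox",
    "b": "the quick brown fox",
    "c": "a slow green turtle",
}
neighbors = nearest_neighbors("quick brown fox jumps", documents)
potential_duplicates = flag_identical_scores(neighbors)
```

In this sketch, documents "a" and "b" receive the same score against the query and are 
grouped together as potential duplicates, while "c" is not flagged. Any other numeric 
similarity score could be substituted for the hypothetical measure without changing the 
comparison step. 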

The examiner disagrees with the appellant's arguments on pages 4-11, regarding 
the inability to combine the references due to lack of motivation. The motivation as 
shown in the previous rejection is drawn from the Pugh reference (column 7, line 56- 
column 8, line 6). This section discloses that in a group of results, near-duplicate 
documents (which includes exact duplicate documents) may be eliminated and in the 
final results presentation only one of every set of duplicate documents is presented, 
thus decreasing unwanted repetitious results and increasing the accuracy of the 
returned results and the user's ability to find the relevant information being sought 
(column 7, line 66-column 8, line 6 of Pugh). The Prager reference teaches that 
categorization results in sets of documents are determined and presented to users, 
because categorization provides a way of grouping objects that are similar into sets, 
thus making it easier for users to navigate through the results (column 1, lines 13-55 of 
Prager). Both Prager and Pugh direct teachings toward increasing the accuracy of 
results and the user's ability to navigate to the most relevant information being sought. 
Combining the methods of Prager and Pugh would increase the accuracy of 
the results as a whole by eliminating duplicates, thus increasing the accuracy of the 
categorization by eliminating duplicates from the results presented for categorization 
and increasing the user's ability to navigate to the relevant information being sought. 

The appellant also argues that the combination of the two references would 
render the primary reference unsuitable for its intended use and change its principle of 
operation. The primary reference as cited in the rejections presented to the appellant is 
the Prager reference. The Prager reference teaches that categorization results in sets 
of documents are determined and presented to users, because categorization provides 
a way of grouping objects that are similar into sets, thus making it easier for users to 
navigate through the results (column 1, lines 13-55 of Prager). Both Prager and Pugh 
direct teachings toward increasing the accuracy of results and the user's ability to 
navigate to the most relevant information being sought. Combining the methods of Prager 
and Pugh would increase the accuracy of the results as a whole by 
eliminating duplicates, thus increasing the accuracy of the categorization by eliminating 
duplicates from the results presented for categorization and increasing the user's ability 
to navigate to the relevant information being sought. The appellant argues that because 
the Pugh reference is intended only for "large collections of documents... literally 
billions of Web site 1 documents," the combination would not work. However, first of 
all, the statement in the Pugh reference that "The present invention concerns information 
management and retrieval in general. More specifically, the present invention concerns 
detecting, and optionally removing, duplicate and near-duplicate information or content, 
such as in a repository of documents to be searched for example," (column 1, lines 7-11 
of Pugh) is being used for the purposes of rejection, and this interpretation in no way 
"renders the prior art unsatisfactory for its intended purpose." While the examiner 
agrees that the Pugh reference does discuss the use of the teachings in a web search 
engine, the reference is not limited to that use; thus the interpretation of the reference 
will not be limited to it. The repository of documents to be searched in Pugh, based on 
the interpretations of both specifications, can be identical to the repository of 
Prager. At no point does adding the teachings of Pugh to the invention disclosed by 
Prager render the Prager reference unusable. 

In any case, the examiner would like to direct attention to the fact that the 
teachings of Prager for categorization also include the use of the teachings with internet 
search engines (column 1, lines 22-55 of Prager). The examiner points out that only in 
one of the more advanced embodiments involving cross categorization does Prager 
view his method as unusable with vast search engine systems (column 8, lines 10-16 of 
Prager), but for the simple objective of categorization (column 3, lines 8-10 of Prager) 
no problem exists. Thus, even if the Pugh reference is restricted to Internet search 
engines, the combination would remain proper and not deteriorate the usability of either 
of the references. Thus, the combination of the references remains proper, and a 
motivation to combine exists that is not detrimental to either reference. 

Regarding the appellant's arguments on page 11 concerning claims 3 and 4 and 
whether or not Prager teaches calculating nearest neighbor scores and retaining lists of 
those scores, the examiner believes that the rejection is proper based on the cited 
portions of the Prager reference. Prager teaches that K-nearest-neighbor (KNN) may 
be used to categorize the documents, and in the case of the teachings of Prager a "hit- 
list" is generated which is a list of the best-matching nearest neighbors to the document 
in question based on the similarity scores generated (column 2, lines 17-33 of Prager). 
Thus, Prager clearly discloses generating KNN scores which are then stored in the hit- 
list for the purposes of categorization (a purpose other than potentially duplicate 
document detection), and as stated in the rejection of independent claim 1 (on which 
claims 3 and 4 depend) Pugh teaches that these scores may then be used for 
potentially duplicate document detection (column 7, line 26-column 8, line 28 of Pugh). 
Thus, as cited, the rejection properly teaches the claimed limitations of claims 3 and 4. 

(11) Related Proceeding(s) Appendix 

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 



For the above reasons, it is believed that the rejections should be sustained. 

Respectfully submitted, 
Joshua Campbell 



Stephen Hong 
Supervisory Patent Examiner 